Abstract

This paper describes the implementation and evaluation of an open
source library for mathematical morphology based on packed binary and
run-length compressed images for document imaging applications.
Abstractions and patterns useful in the implementation of the interval
operations are described. A number of benchmarks and comparisons to
bit-blit based implementations on standard document images are provided.

1 Introduction 

Binary morphology is an important and widely used method in document
image analysis, useful for tasks like image cleaning and noise removal [21],
layout analysis [23], skew correction [15], and text line finding [6].
Real-world document analysis systems currently rely primarily on bit
blit-based implementations. Practical implementations take advantage of
separability and logarithmic decomposition of rectangular structuring
elements [TOIHIITS].

This technical report describes a binary morphology library containing both
a run-length and a packed binary implementation of morphological operations.
A number of the methods described in this paper are very similar to methods
described in the literature [13] [T9], although the library was developed
independently of that literature. The paper will not provide a detailed
discussion of the similarities and differences between the algorithms
described in this memo and those in the literature.^1 This memo does provide
a number of benchmarks that should help practitioners choose good algorithms
for their particular applications.

^1 Comments and additional references to prior work would be appreciated,
however.

We note that, in addition to run length and packed binary methods, a number
of other methods have been described in the literature. Binary mathematical
morphology with convex structuring elements can be computed by propagation
of distances on the pixel grid using a dynamic programming algorithm [22]
(brushfire algorithms). Another class of algorithms is based on contours [2T]
and loop and chain methods [22]. The van Herk/Gil-Werman algorithms
[20] [TO] [9] have constant per-pixel overhead, independent of the size of
the structuring element, for grayscale morphology, and binary morphology can
be viewed as a special case. Another class of algorithms takes advantage of
anchors [8]. Although some of these algorithms are competitive for grayscale
morphology, they have not been demonstrated to be competitive with high
quality bit blit-based implementations for binary morphology on packed
binary representations [1].

Some authors have looked again at grayscale morphology, using more complex
intermediate representations [7]. It remains to be seen how such algorithms
compare to the algorithms in this paper, both in performance and storage; we 
will not be addressing that question here. 

Bit blit-based implementations at their lowest level take advantage of
operations that are highly efficient on current hardware because they are
used as part of many different algorithms and display operations. They also
have drawbacks: their running time grows quadratically in the resolution of
the input image; they do not take advantage of coherence in the input image
(an almost blank image takes the same amount of time to process as a highly
detailed image); and operations that need to take into account the
coordinates of individual pixels (e.g., connected component labeling) often
need to decompress (at least on the fly) or use costly pixel access
functions.

We will mostly limit ourselves in this paper to the development of morpho- 
logical operations involving rectangular structuring elements. These are by far 
the most common operations in document image analysis. However, the run- 
length method can also be used for implementing morphological operations for 
arbitrary masks; algorithms and performance will be given in a separate paper. 

Converting between run length and non-run length representations can be 
carried out fairly quickly, so we also have the option of mixing run-length and 
bitmap representations. However, many binary image processing algorithms 
can be implemented directly on run length images. In fact, prior work in image 
processing on the line adjacency graph and algorithms operating on it are di- 
rectly transferable. We therefore briefly discuss a number of these algorithms. 
Taken together with the binary morphology operations in this paper, they allow 
complete binary image processing pipelines to be built on run length images, 
with no conversion costs. 

Finally, we give benchmarks and comparisons with the Leptonica library, 
an open source library for morphological image processing. It has compara- 
tively good performance, uses well-documented algorithms, and is used in sev- 
eral large-scale document analysis systems. 

2 Run Length Image Coding 

Run-length image representations have a long history in image processing and 
analysis. They have been used, for example, for efficient storage of binary and 
color images and for skeletonization of large images. 

Consider a 1D boolean array $a$ containing pixel values 0 and 1 at each
location $a_i$. The run length representation $r$ is an array of intervals
$r_1 = [s_1, e_1], \ldots, r_n = [s_n, e_n]$ such that $a_i = 1$ iff
$i \in r_j$ for some $j$, and $e_j < s_{j+1}$.

The 2D run-length representation we are using in this paper is a
straightforward extension to 2D that treats the two coordinates
asymmetrically; in particular, the binary image^2 $a_{ij}$ is represented as
a sequence of one-dimensional run-length representations $r_{ij}$, such that
for any fixed $i_0$, the 1D array $a_j = a_{i_0 j}$ is represented by the 1D
run-length representation $r_j = r_{i_0 j}$.

Even algorithms that are not explicitly using run length representations are 
often still implicitly manipulating runs of pixels internally; for example, the 
usual connected component labeling algorithm internally considers neighbor- 
hood relations between runs of pixels. An extended version of 2D run-length 
representations has been used as the line adjacency graph (LAG); it adds a
graph structure encoding neighborhood relations between runs to the basic run 
length encoding; in our algorithms, these neighborhood relations are simply 
inferred dynamically. 

In practice, since all the algorithms described in this paper access runs
sequentially, either linked lists or extensible arrays with exponential
doubling can be used to represent the runs of each line; our implementation
uses extensible arrays with exponential doubling, which results in fewer
calls to the memory allocator, less average memory usage, and better
locality of reference than linked list representations. In our current
implementation, each run is represented as a pair of 16 bit integers,
allowing images up to approximately 65535 x 65535 to be represented; other,
more efficient coding schemes are possible (e.g., using a Unicode-like
variable length encoding). Note that, in the worst case, that of alternating
black and white pixels, the run-length representation may be up to 16 times
bigger than a packed bit representation, or a factor of two compared to a
one-byte-per-pixel representation.
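
The encoding of a single line and the worst-case storage estimate can be
illustrated with a minimal Python sketch. It is not the library's code: the
names encode_line, decode_line, and rle_bits are hypothetical, and the sketch
assumes half-open intervals [start, end) rather than the closed intervals
used in the definition above.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # assumed convention: half-open interval [start, end)


def encode_line(pixels: List[int]) -> List[Run]:
    """Collect maximal runs of 1-pixels as half-open [start, end) intervals."""
    runs: List[Run] = []
    start = None
    for x, value in enumerate(pixels):
        if value and start is None:
            start = x                      # a run begins at x
        elif not value and start is not None:
            runs.append((start, x))        # the run ended at pixel x - 1
            start = None
    if start is not None:
        runs.append((start, len(pixels)))  # run touching the right border
    return runs


def decode_line(runs: List[Run], width: int) -> List[int]:
    """Expand runs back into an unpacked 0/1 array of the given width."""
    pixels = [0] * width
    for start, end in runs:
        for x in range(start, end):
            pixels[x] = 1
    return pixels


def rle_bits(runs: List[Run]) -> int:
    """Storage cost in bits when each run is a pair of 16 bit integers."""
    return 32 * len(runs)


line = [0, 1, 1, 1, 0, 0, 1, 0]
runs = encode_line(line)            # two runs: pixels 1..3 and pixel 6
worst = [x % 2 for x in range(64)]  # alternating pixels, the worst case
```

For the alternating line, every second pixel starts a new 32-bit run, which
is where the factor of 16 relative to one bit per pixel comes from.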

On the other hand, the size of run-length encoded images scales linearly
with image resolution, rather than quadratically. That is, a 1200 dpi binary
image takes approximately 4 times as much space as a 300 dpi binary image
using run-length encoding, while a packed bit image would take 16 times as
much space.

3 Morphological Operations 

Because of the asymmetry in the two dimensions of the 2D run-length
representation we are using, morphological operations behave differently in
the x and y direction in run-length representations. An analogous asymmetry
is found in bit-blit operations, in which the bits making up image lines are
packed into words, and a list of lines represents the entire image. There
are multiple possible approaches for dealing with this issue. First, we can
implement separate operations for the horizontal and vertical directions.
Second, we can implement only the within-line operations and then transform
the between-line operations into within-line operations through
transposition. For separable operations, the second approach is often the
easier one. Therefore, an erosion with a rectangular structuring element of
size u x v can be written as:^3

function erode2d(image, u, v)
    erode1d(image, u)
    transpose(image)
    erode1d(image, v)
    transpose(image)
end

^2 This paper and our library use PostScript/mathematical conventions, with
$a_{0,0}$ representing the bottom left pixel of the image.



4 Within-Line Morphological Operations 

There are four basic morphological operations we consider: erosion, dilation,
opening, and closing. One-dimensional opening and closing are the easiest to
understand. Essentially, a one-dimensional opening with size u simply deletes
all runs of pixels that are shorter than u, and leaves all others untouched:^4

function open1d(image, u)
    for i in 1, length(image.lines) do
        line = image.lines[i]
        filtered = []
        for j in 1, length(line.runs) do
            -- keep only runs that are at least u pixels wide
            if line.runs[j].width >= u then
                filtered.append(line.runs[j])
            end
        end
        image.lines[i] = filtered
    end
end

A one-dimensional closing with size u deletes all gaps that are smaller than
size u, joining the neighboring intervals together. It can either be implemented
directly, or it can be implemented in terms of complementation and opening:

function complement(image)
    for i in 1, length(image.lines) do
        line = image.lines[i]
        filtered = []
        last = minint  -- assumption: the initial value is lost in the source; minint mirrors the final maxint
        for j in 1, length(line.runs) do
            run = line.runs[j]
            -- the gap before each run becomes a run of the complement
            newrun = make_run(last, run.start)
            filtered.append(newrun)
            last = run.end
        end
        filtered.append(make_run(last, maxint))
        image.lines[i] = filtered
    end
end

function close1d(image, u)
    complement(image)
    open1d(image, u)
    complement(image)
end

^3 Our convention is output arguments before input arguments, and the various
procedures modify the image in place.

^4 We are using 1-based arrays in the pseudo-code.

^5 To simplify boundary conditions, we are using the notation exp1 or exp2 to
mean: use the value of exp1 if it is defined, otherwise use exp2.

Note that openings and closings are not separable, so we cannot use these
implementations directly for implementing true 2D openings and closings; for
that, we have to combine erosions and dilations. However, even as they are,
these simple operations are already useful and illustrate the basic idea
behind run-length morphology: run-length morphology is selective deletion
and/or modification of pixel runs.

The most important operation in run-length morphology is one-dimensional
erosion. As in one-dimensional opening, we walk through the list of runs,
but instead of only deleting runs smaller than u, we also shrink the
remaining runs by u/2 on each side (strictly speaking, for erosions on
integer grids, we shrink by floor(u/2) on the left side and u - floor(u/2)
on the right side, and use the opposite convention for dilations). In
pseudo-code, we can write this as follows:

function erode1d(image, u)
    for i in 1, length(image.lines) do
        line = image.lines[i]
        filtered = []
        for j in 1, length(line.runs) do
            -- runs narrower than u vanish; all others shrink on both sides
            if line.runs[j].width >= u then
                start = line.runs[j].start + u/2
                end = line.runs[j].end - u/2
                filtered.append(make_run(start, end))
            end
        end
        image.lines[i] = filtered
    end
end

As with opening/closing, dilation can be implemented directly or via
complementation:

function dilate1d(image, u)
    complement(image)
    erode1d(image, u)
    complement(image)
end

In terms of computational efficiency, obviously, all these operations are linear 
in the total number of runs in the image. 
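
The within-line operations can be sketched in a few lines of Python. This is
an illustrative sketch rather than the library's implementation: it assumes
half-open runs [start, end), a structuring element of u pixels whose origin
sits floor(u/2) pixels from its left end, and background outside the line.
Under these assumptions a run of width w erodes to width w - u + 1, which is
the usual pixel-exact convention and may differ by one pixel from the
shrinking rule stated above. Dilation is implemented directly (with merging
of runs that grow together) instead of via complementation.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # assumed convention: half-open interval [start, end)


def open1d(runs: List[Run], u: int) -> List[Run]:
    """Opening: delete every run narrower than u, keep the rest unchanged."""
    return [(s, e) for s, e in runs if e - s >= u]


def erode1d(runs: List[Run], u: int) -> List[Run]:
    """Erosion: drop runs narrower than u and shrink the others."""
    left = u // 2          # element pixels to the left of the origin
    right = u - 1 - left   # element pixels to the right of the origin
    return [(s + left, e - right) for s, e in runs if e - s >= u]


def dilate1d(runs: List[Run], u: int, width: int) -> List[Run]:
    """Dilation: grow each run, clip to the line, merge runs that now touch."""
    left = u // 2
    right = u - 1 - left
    out: List[Run] = []
    for s, e in runs:
        s, e = max(s - left, 0), min(e + right, width)
        if out and s <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], e))  # merge with previous
        else:
            out.append((s, e))
    return out
```

Each function makes one pass over the runs of a line, in keeping with the
linear cost noted above; with these conventions, erosion followed by
dilation reproduces open1d.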

5 Efficient Transpose 

Transposition means that we need to construct runs of pixels in the direction 
perpendicular to the current run-length encoding. A simple way of transposing 
is to essentially decompress each run individually and then accumulate the de- 
compressed bits in a second run length encoded binary image [T] [Tl] . For this, 
we maintain an array of currently open runs in each line of the output image 
and iterate through the runs of the current line in the input image. For the 
range of pixels between the runs of the current line in the input image, we finish 
off the corresponding open runs in the output image. For the range of pixels 
overlapping the runs of the current line in the input image, we start new runs 
for lines where runs are not currently open and continue existing open runs for 
lines where runs are currently open. In terms of pseudo code, that looks as 
follows: 

function transpose_simple(image)
    output = make_rle_image()
    open_runs = make_array(new_image_size)
    for i = 1, length(image.lines) do
        line = image.lines[i]
        last = 1
        for j = 1, length(line.runs) do
            run = line.runs[j]
            -- pixels between runs: finish off any open output runs
            for k = last, run.start do
                if open_runs[k] ~= nil then
                    newrun = make_run(open_runs[k], i)
                    output.lines[k].append(newrun)
                    open_runs[k] = nil
                end
            end
            -- pixels inside the run: start output runs where none is open
            for k = run.start, run.end do
                if open_runs[k] == nil then
                    open_runs[k] = i
                end
            end
            last = run.end
        end
    end
    ... finish off the remaining runs here ...
end
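
A runnable version of this simple decode-recode transposition might look as
follows in Python. It is a sketch under stated assumptions, not the
library's code: runs are half-open [start, end) intervals, indices are
0-based, the image is a list of lines, and the helper _close is a
hypothetical name. Unlike the pseudo-code, it also closes open runs for the
background pixels to the right of the last run of each line.

```python
from typing import List, Optional, Tuple

Run = Tuple[int, int]  # assumed convention: half-open interval [start, end)


def _close(out: List[List[Run]], open_runs: List[Optional[int]], x: int, y: int) -> None:
    """Finish the run that is open in output line x, if any, ending before y."""
    if open_runs[x] is not None:
        out[x].append((open_runs[x], y))
        open_runs[x] = None


def transpose_simple(lines: List[List[Run]], width: int) -> List[List[Run]]:
    """Transpose a run-length image with len(lines) rows and `width` columns."""
    out: List[List[Run]] = [[] for _ in range(width)]
    open_runs: List[Optional[int]] = [None] * width  # start row of open runs
    for y, runs in enumerate(lines):
        last = 0
        for s, e in runs:
            for x in range(last, s):      # background before this run
                _close(out, open_runs, x, y)
            for x in range(s, e):         # foreground: open a run if needed
                if open_runs[x] is None:
                    open_runs[x] = y
            last = e
        for x in range(last, width):      # background after the last run
            _close(out, open_runs, x, y)
    for x in range(width):                # runs that reach the last row
        _close(out, open_runs, x, len(lines))
    return out
```

The cost of this version is proportional to the number of pixels rather than
the number of runs, which is the inefficiency the merge-based algorithm
addresses.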

This simple algorithm is usable, but it does not take advantage of the
coherence between lines in the input image. To take advantage of that, we
need a more complicated algorithm; the algorithm is somewhat similar to the
rectangular sweeping algorithm used for finding maximal empty rectangles [3].

Figure 1: The figure illustrates a merge step during the transposition step.
The algorithm maintains a list of open intervals and information about how
many steps each interval has been open for. It then considers the next
run-length encoded line in the input. Ranges in the input that do not
overlap any intervals in the new line are finished and give rise to runs in
the output. Ranges in the input that overlap runs in the next line give rise
to intervals in the open line that have their step number incremented by
one. Ranges in the next line that do not correspond to any range in the list
of open intervals give rise to new intervals with their step values
initialized to one.

The basic idea behind the transposition algorithm is to replace the array of
open runs in the above algorithm with a list of runs, each of which
represents an open run in the perpendicular direction. This is illustrated
in Figure 1. The actual inner loop is similar to the algorithm shown above
for the per-pixel updating, but because of the 13 possible relationships
between two closed intervals, the inner loop contains a larger case
statement; this will not be reproduced here. This new run length
transposition algorithm speeds up the overall operation of the binary
morphology code several-fold relative to the simple decode-recode
implementation.

6 Between Line Boolean Operations 

A bit blit based implementation of mathematical morphology uses as its prim- 
itive an operation that performs a logical operation (AND, OR, XOR, AND- 
NOT, OR-NOT, NAND, NOR, etc.) between the pixels of two images, allowing 
for relative shifts (see also [23]). We can implement the same operations in our 
library. The general idea is to consider the runs in the two source lines from left 
to right and merge them together. For example, if the operation is AND, then
this means keeping only those portions of runs that overlap a run in the
other line. Because of the 13 possible relationships between intervals, this
is fairly complicated. However, there is a pair of useful abstractions that
greatly simplifies the implementation of these kinds of interval merging
operations.^6 Basically, instead of considering lines as collections of
runs, we consider them as collections of transitions from black to white and
from white to black. We operate on those collections using two simple
abstractions:

• A TransitionSource returns locations of transitions in ascending order, 
indicating for each transition whether it is from black to white or from 
white to black. 

• A TransitionSink accepts locations of transitions in ascending order and 
re-assembles them into intervals. 

With these abstractions, the main loop of line_and becomes simply (in C++):

TransitionSink sink(out, total);
TransitionSource src1(l1, 0);
TransitionSource src2(l2, offset2);
int where = MININT;
bool b1 = false;
bool b2 = false;
while (src1 || src2) {
    if (src1.coord() < src2.coord()) {
        b1 = src1.value();
        where = src1.coord();
        src1.next();
    } else {
        b2 = src2.value();
        where = src2.coord();
        src2.next();
    }
    sink.append(where, b1 && b2);
}

The simplification of the code results from the fact that the main loop
becomes a simple ordered merge of lists of numbers, and the sink data
structure takes care of the different cases; for example, if it receives a
sequence of transitions $(x_0,F),(x_1,T),(x_2,T),(x_3,F),(x_4,F)$, it will
generate a run from $x_1$ to $x_3$. Boundary conditions are additionally
simplified because the TransitionSource returns a big integer after running
out of locations, which eliminates separate code after the main loop to
finish off the unfinished line when the other line has run out of
transitions.
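
To make the two abstractions concrete, here is a small Python sketch of the
same merge loop. It is an illustration under assumptions, not a port of the
library: the class and method names follow the C++ fragment, but their
bodies, the sentinel BIG, the helper merge_lines, and the half-open run
convention are all hypothetical. The sketch handles operations that map two
background pixels to background (AND, OR, XOR); it omits the clipping that
the `total` argument of the C++ sink presumably performs.

```python
BIG = 1 << 30  # assumed sentinel coordinate for an exhausted source


class TransitionSource:
    """Yields (coordinate, new pixel value) transitions in ascending order."""

    def __init__(self, runs, offset=0):
        self.events = []
        for s, e in runs:
            self.events.append((s + offset, True))   # run starts
            self.events.append((e + offset, False))  # run ends
        self.pos = 0

    def __bool__(self):
        return self.pos < len(self.events)

    def coord(self):
        return self.events[self.pos][0] if self else BIG

    def value(self):
        return self.events[self.pos][1]

    def next(self):
        self.pos += 1


class TransitionSink:
    """Re-assembles a stream of (coordinate, value) pairs into runs."""

    def __init__(self):
        self.runs = []
        self.start = None

    def append(self, where, value):
        if value and self.start is None:
            if self.runs and self.runs[-1][1] == where:
                self.start = self.runs.pop()[0]  # re-open run ending here
            else:
                self.start = where
        elif not value and self.start is not None:
            if where > self.start:               # skip zero-width runs
                self.runs.append((self.start, where))
            self.start = None


def merge_lines(l1, l2, op, offset2=0):
    """Combine two run lists with a boolean op; line 2 is shifted by offset2."""
    sink = TransitionSink()
    src1, src2 = TransitionSource(l1), TransitionSource(l2, offset2)
    b1 = b2 = False
    while src1 or src2:
        if src1.coord() < src2.coord():
            b1, where = src1.value(), src1.coord()
            src1.next()
        else:
            b2, where = src2.value(), src2.coord()
            src2.next()
        sink.append(where, op(b1, b2))
    return sink.runs
```

A call such as merge_lines(a, b, lambda p, q: p and q) then plays the role
of the AND loop; the sink absorbs the cases where transitions of the two
lines coincide.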

^6 A similar technique can be applied to the transpose operation above, but
that operation was written by considering the different possible cases
directly.
7 Efficient Binary Decompositions



Binary decomposition of linear structuring elements is a widely used tech- 
nique for real-world, fast binary morphology. For reviews of the technique, 
see [HJ m [15] . We can combine the "bit blit-like" operations above with binary 
decomposition for an alternative approach to binary morphology on run-length 
images. 

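Generally, a linear structuring element of length n can be decomposed into

$$nS = S \oplus E(S) \oplus E(2S) \oplus \cdots \oplus E(2^{m-1}S) \oplus E((n - 2^{m})S) \qquad (1)$$

A simple example of this is the decomposition of the $8L_0$ structuring element [15]:

$$8L_0 = \{\bullet\,\bullet\,\bullet\} \oplus \{\bullet\circ\bullet\} \oplus \{\bullet\circ\circ\circ\bullet\} \oplus \{\bullet\circ\circ\circ\circ\circ\circ\circ\bullet\} \qquad (2)$$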

This involves 9 pixels, and hence 9 blit operations in a straightforward bit 
blit implementation. This represents a significant savings over the original 19 
pixels of the linear structuring element, but it does not represent the optimal 
decomposition of a structuring element of 19 pixels, which requires just 5 blit 
operations: 

width = 1;
while (2*width < r) {
    bits_and(image, image, width, 0);
    width *= 2;
}
if (width < r) bits_and(image, image, r-width-1, 0);

To center the operation properly, we need to shift the image prior to these
operations. The overall cost of decomposing a line structuring element
therefore is a shift operation plus $\lceil \log_2 r \rceil$ blit
operations, where r is the width of the structuring element in pixels. In
the run length morphology library, this approach is applied to the
between-line morphological operations.
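
The doubling idea can be checked on a single packed row with a short Python
sketch. This is an independent illustration, not the loop above: the
function name erode_row_log is hypothetical, a Python integer stands in for
a packed row (bit x is pixel x), the element is anchored at its left end so
no centering shift is applied, and the loop bounds are chosen so that the
result covers exactly r pixels.

```python
from typing import Tuple


def erode_row_log(row: int, r: int) -> Tuple[int, int]:
    """Erode a packed row with a horizontal line of r pixels.

    Output bit x is set iff input bits x .. x+r-1 are all set. Returns the
    eroded row and the number of shifted AND operations that were used.
    """
    acc, width, ands = row, 1, 0
    while 2 * width <= r:
        acc &= acc >> width        # acc now tests 2*width adjacent pixels
        width *= 2
        ands += 1
    if width < r:
        acc &= acc >> (r - width)  # overlapping block completes the r pixels
        ands += 1
    return acc, ands


# A row of 19 set pixels eroded by a 19-pixel line leaves one pixel.
result, count = erode_row_log((1 << 19) - 1, 19)
```

For r = 19 the sketch uses four doubling steps and one final overlapping
step, which agrees with the five operations quoted above.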

An alternative loop avoids the initial shift, which causes some undesirable
behavior at the image border. The idea is to perform exponential doubling to
cover at least half of the right half of the structuring element, and then
finish off the operation with three more operations (overall, this requires
one extra blit relative to the simple version above):

width = 1;
while (2*width < r/2) {
    bits_and(image, image, width, 0);
    width *= 2;
}
if (width < r/2) bits_and(image, image, r/2-width-1, 0);
bits_and(image, image, -(r-r/2), 0);
if (width < r-r/2) bits_and(image, image, -(r-r/2)+width, 0);
8 2D Morphological Operators for Rectangular Masks

Given the operations developed above (within-line morphological operations,
transpose, between-line binary operations, and logarithmic binary
decomposition), as well as the separability property of rectangular masks,
we now have several different ways of composing 2D morphological operators
using rectangular structuring elements:

• "brute force" implementation: this simply performs a shifted AND or OR 
operation for each pixel in the mask, quite analogous to a brute-force 
bit blit implementation; this is a slow operation useful for reference and 
verification 

• transpose and within-line operations: we first perform morphology along
the within-line direction, then transpose, then perform morphology again, 
and then transpose back 

• within-line and between-line operations: we perform morphology along one 
direction using the within-line operators, and along the other direction 
using logarithmic decomposition and between-line operations; this can be 
carried out in either order 

It is not a priori obvious which of these choices is the most efficient, but it 
turns out experimentally that the last one works fastest for document images. 
Furthermore, it is important to carry out the within-line operations before the 
between-line operations because the former are far more efficient when coping 
with images with many runs. We will return to this issue in the experimental 
section. 

9 Morphology with Arbitrary Masks 

Many structuring elements in practice are handled as arbitrary bitmasks, using
a straightforward loop such as:

for (i=0; i<w; i++) for (j=0; j<h; j++) {
    if (element(i,j)==0) continue;
    bits_and(result, image, -i+cx, -j+cy);
    count++;
}

Obviously, the running time of loops like that grows linearly in the number
of pixels in the structuring element.

Fortunately, for run-length encodings, we can perform these operations a 
run at a time, rather than a pixel at a time. Essentially, for each run in the 
mask, we perform the operation on a copy of the image, then apply the result 
to an accumulator image, which we finally return. 
for (run in horizontal_runs(element)) {
    temp = copy(image);
    erode_line_horizontal(temp, run.x, run.y0, run.y1);
    bits_and(result, temp);
    count++;
}

Of course, for run length morphology, we can carry out erode_line_horizontal 
directly. 
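
As a hedged illustration of the run-at-a-time idea, the following Python
sketch erodes a run-length image with an arbitrary mask given as a list of
horizontal runs. Everything about the interface is an assumption made for
the example: runs are half-open [start, end), a mask run is a triple
(dy, x0, x1) covering horizontal offsets x0 .. x1-1 at vertical offset dy
from the mask origin, pixels outside the image count as background, and the
names intersect and erode_with_mask are hypothetical.

```python
from typing import List, Tuple

Run = Tuple[int, int]             # half-open interval [start, end)
MaskRun = Tuple[int, int, int]    # (dy, x0, x1): offsets x0 .. x1-1 at row dy


def intersect(a: List[Run], b: List[Run]) -> List[Run]:
    """AND of two sorted run lists using a two-pointer sweep."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        s, e = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if s < e:
            out.append((s, e))
        if a[i][1] < b[j][1]:     # advance whichever run ends first
            i += 1
        else:
            j += 1
    return out


def erode_with_mask(lines: List[List[Run]], mask: List[MaskRun], width: int):
    """Erode the image by the mask, handling one mask run per pass."""
    result = None
    for dy, x0, x1 in mask:
        temp = []
        for y in range(len(lines)):
            src = lines[y + dy] if 0 <= y + dy < len(lines) else []
            row = []
            for s, e in src:
                # x survives iff x+x0 .. x+x1-1 all lie inside [s, e)
                s2, e2 = max(s - x0, 0), min(e - x1 + 1, width)
                if s2 < e2:
                    row.append((s2, e2))
            temp.append(row)
        # accumulate: a pixel must survive the pass for every mask run
        result = temp if result is None else [
            intersect(p, q) for p, q in zip(result, temp)]
    return result
```

The work per pass is proportional to the number of runs in the image, so the
total cost depends on the number of runs in the mask rather than the number
of its pixels.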

For bit blit morphology, the naive implementation would use binary
decomposition to compute erode_line_horizontal separately for each run. For
completeness (and because it is implemented in the companion bit blit-based
operations) let us observe that such an approach involves many unnecessary
copies and recomputations of intermediate results; the overall complexity is
proportional to #runs · log(max_run_width). A better way to perform
decomposition of arbitrary masks is to accumulate longer and longer
horizontal structuring elements and apply them as needed. The following
simple pseudocode illustrates the idea (it assumes that the structuring
element has been flipped prior to the computation; maxwidth is the maximum
run width):

temp = copy(image);
for (width=1; width<maxwidth; width*=2) {
    for (run in runs_of_element) {
        run_width = run.y1-run.y0;
        if (run_width >= width && run_width < 2*width) {
            bits_and(image, temp, run.x, run.y0);
            if (run_width>width) bits_and(image, temp, run.x, run.y1-run_width-1);
        }
    }
}

For a circle of radius r, this involves at most
$2r + \lfloor \log \mathrm{maxwidth} \rfloor$ calls to the bits_and
function, plus a copy. A separate shift is not needed to center the result
since the runs themselves can be offset appropriately.

10 Scaling, Skewing, and Rotation 

Scaling, skewing, and rotation are other important operations in document im- 
age analysis, used during display and skew correction. 

Scaling can be implemented by scaling the coordinates of each run and scal- 
ing up or down the array holding the lines by deleting or duplicating line arrays. 
Scaling can also be implemented as part of the conversion into an unpacked rep- 
resentation (as required by, for example, window systems). 

Skew operations can be implemented within each line by shifting the start
and end values associated with each run. Bitmap rotation by arbitrary angles
can then be implemented by the usual decomposition of rotations into a
sequence of horizontal and vertical skew operations, using successive
application of transposition, line skewing, and transposition in order to
achieve skews perpendicular to the lines in the run-length representation.
We note that this method differs substantially from previously published
rotation algorithms for run length encoded images [5].
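
A within-line skew on run-length data can be sketched as follows. The
function name skew_lines, the rounding rule, and the clipping to the line
width are assumptions made for this example, and runs are again half-open
[start, end) intervals; the library may handle rounding and borders
differently.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # assumed convention: half-open interval [start, end)


def skew_lines(lines: List[List[Run]], skew: float, width: int) -> List[List[Run]]:
    """Shift line y horizontally by round(skew * y), clipping to [0, width)."""
    out = []
    for y, runs in enumerate(lines):
        shift = round(skew * y)            # whole-pixel shift for this line
        clipped = []
        for s, e in runs:
            s, e = max(s + shift, 0), min(e + shift, width)
            if s < e:                      # drop runs shifted off the line
                clipped.append((s, e))
        out.append(clipped)
    return out
```

Since only the two endpoints of each run are touched, the cost is linear in
the number of runs; a skew in the perpendicular direction would be obtained
by transposing, skewing, and transposing back, as described above.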

11 Morphology with Lines at Arbitrary Orientations

Finally, let us look at one more case related to lines: morphological operations 
with linear structuring elements at arbitrary angles. A number of papers have 
been written about this, using different approaches. Oddly enough, few if any 
of the papers appear to reference the most obvious approach: bitmap rotation
followed by an axis-aligned structuring element; this would at least be an
important control experiment [17] [18]:

bits_erode_line(image, r, angle) {
    bits_rotate(image, angle);
    erode_line_horizontal(image, 2*r);
    bits_rotate(image, -angle);
}

The rotations can be composed from three skew operations, as for general 
bitmap rotations. In fact, the same approach also works for rotated rectan- 
gles. 

However, a rotation of a horizontal line segment will give rise to a rotated 
line segment whose bits differ slightly from the bits generated by rendering a 
digital line segment at that angle using the true linear equations or Bresen- 
ham's algorithm. Furthermore, the skew operations dominate this approach to 
morphological operations with rotated lines. 

Fortunately, there is a simpler, pixel accurate solution: we apply only a single
skew operation and correct for the change in length. With this, line erosions
at arbitrary angles in the range $[-\pi/4, \pi/4]$ become (angles outside this
range can be handled by flips and transposes):

bits_erode_line(image, r, angle) {
    skew = tan(angle);
    corrected = r*cos(angle);
    bits_skew(image, skew);
    erode_line_horizontal(image, 2*corrected);
    bits_skew(image, -skew);
}

Here, the function bits_skew moves each pixel at (x, y) to (x, y + skew · x).
Note that the skew operation is perpendicular to the erosion. It is likely faster 
to skew along each line (via bit shifting the line) and then perform the erosion 
perpendicular to that than the other way around. 
12 Conversion Between Image Formats



Conversion between the run-length encoded representation and either unpacked
or packed bit-images is straightforward. We note that input/output can be
implemented particularly efficiently in terms of run-length image
representations, since many binary image formats internally already perform
some form of run-length compression, and their runs can be directly
translated into runs in the in-memory representation. Even for file formats
that do not use runs, input/output can be implemented by compressing and
decompressing the image one line at a time; that is, we read a line by
calling the image decompression library, decode the line in an unpacked
format into a 1D array, and then compute the run-length encoding of that
array.

13 Connected Component Extraction and Statistics

Connected components, and statistics over them, can also be computed quickly: 

• We associate a label value label [i] [j] with each run lines [i] [j] . 

• For each run in the entire image, we create a set in a union-find data 
structure. 

• We then iterate through all the lines in the image and, for each run in the 
current line merge its label with the labels of any runs in the line above. 
This can be done in linear time in the number of runs in each line. 

• Finally, we renumber the entries in label [i] [j] according to the canon- 
ical set representative from the union-find data structure. 

This is similar to a connected component algorithm on the line adjacency graph
(but the order in which nodes are explored can be different). It is also similar 
to efficient connected component algorithms operating on bitmap images, but 
runs are used instead of iterating over the pixels or words. 

The output of this process is a set of runs lines [i] [j] and correspond- 
ing labels label [i] [j]. We can also think of these structures together as a 
run-length compressed image with pixel values stored in the label array. For 
computing bounding boxes, centroids, moments, boundaries, boundary
properties, or other spatial statistics over these regions, we can iterate through the runs
and accumulate the corresponding information in accumulator arrays indexed 
by the labels stored in the label array. 
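
The labeling steps and the accumulation of a spatial statistic can be
sketched in Python as follows. This is a minimal sketch under assumptions,
not the library's code: runs are half-open [start, end) intervals, two runs
in adjacent lines are connected when their intervals share at least one
column (4-connectivity), and the names label_runs and bounding_boxes are
hypothetical.

```python
from typing import Dict, List, Tuple

Run = Tuple[int, int]  # assumed convention: half-open interval [start, end)


def label_runs(lines: List[List[Run]]) -> List[List[int]]:
    """Return labels[i][j] for every run lines[i][j] (labels start at 1)."""
    parent: List[int] = []

    def find(a: int) -> int:
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    ids: List[List[int]] = []
    for runs in lines:                     # one union-find set per run
        ids.append(list(range(len(parent), len(parent) + len(runs))))
        parent.extend(ids[-1])
    for i in range(1, len(lines)):         # merge with the line above
        above, cur = lines[i - 1], lines[i]
        a = c = 0
        while a < len(above) and c < len(cur):
            if above[a][0] < cur[c][1] and cur[c][0] < above[a][1]:
                union(ids[i - 1][a], ids[i][c])
            if above[a][1] < cur[c][1]:    # advance the run that ends first
                a += 1
            else:
                c += 1
    canonical: Dict[int, int] = {}         # renumber set representatives
    return [[canonical.setdefault(find(k), len(canonical) + 1) for k in row]
            for row in ids]


def bounding_boxes(lines, labels) -> Dict[int, Tuple[int, int, int, int]]:
    """Accumulate half-open boxes (x0, y0, x1, y1) indexed by label."""
    boxes: Dict[int, Tuple[int, int, int, int]] = {}
    for y, (runs, row_labels) in enumerate(zip(lines, labels)):
        for (s, e), lab in zip(runs, row_labels):
            x0, y0, x1, y1 = boxes.get(lab, (s, y, e, y + 1))
            boxes[lab] = (min(x0, s), min(y0, y), max(x1, e), max(y1, y + 1))
    return boxes


image = [[(0, 2), (4, 6)], [(1, 5)], [], [(3, 4)]]
labels = label_runs(image)
boxes = bounding_boxes(image, labels)
```

Other statistics (centroids, moments, areas) would follow the same pattern
as bounding_boxes, with a different accumulator per label.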

14 Other Operations 

There are a number of other operations that can be carried out quickly on 
run-length representations: 



• Run-length statistics are frequently used in document analysis to estimate 
character stroke widths, word spacings, and line spacings; they can be 
computed in linear time for both black and white runs by iterating through 
the runs of an image. In the vertical direction, they can be computed by 
first transposing the image. 

• The line adjacency graph can be computed by treating the runs as nodes 
in the graph and creating edges between any runs in adjacent lines if the 
intervals represented by the runs overlap. 

• Standard skeletonization methods for the line adjacency graph can be
applied after computation of the LAG as described above. (See also [T5].)

• Run-length based extraction of lines and circles using the RAST algorithm
[H] can be applied directly.
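
As an example of the first item, run-length statistics for black and white
runs can be gathered in one pass over the runs. The sketch below is
illustrative only; the function name run_length_histograms is hypothetical,
runs are assumed to be half-open [start, end) intervals, and white runs
touching the line borders are counted as well, which a real application
might prefer to exclude.

```python
from collections import Counter
from typing import List, Tuple

Run = Tuple[int, int]  # assumed convention: half-open interval [start, end)


def run_length_histograms(lines: List[List[Run]], width: int):
    """Return (black, white) histograms mapping run width to its count."""
    black: Counter = Counter()
    white: Counter = Counter()
    for runs in lines:
        last = 0
        for s, e in runs:
            black[e - s] += 1
            if s > last:
                white[s - last] += 1   # gap before this run
            last = e
        if width > last:
            white[width - last] += 1   # gap after the last run
    return black, white


black, white = run_length_histograms([[(1, 4), (6, 7)], []], 8)
```

Vertical statistics would be obtained by applying the same function to the
transposed image.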

15 Experiments 

We have implemented, among others, conversions between run-length, packed 
bit, and unpacked bit representations of binary images, transposition, all the 
morphological operations with rectangular structuring elements described above, 
bitmap rotation by arbitrary angles, computation of run-length statistics, con- 
nected component labeling, and bounding box extraction. For evaluating the 
general behavior of these algorithms and determining whether they are
feasible in practice, we are comparing the performance of the run-length
based algorithms with a companion binary bit blit based morphology package,
as well as the bitmap-based binary morphology implementation in Leptonica
1.48 (8/30/07), an open source morphological image processing library in use
in production code and containing well-documented algorithms and
implementations.
Leptonica contains multiple implementations of binary morphology; the fastest 
general-purpose implementation is pixErodeCompBrick (and analogous names 
for other operations), a method that uses separability and binary decomposition; 
it was used unless otherwise stated. Leptonica also contains partially evaluated 
morphology operators for a number of specific small mask sizes available under 
the names like pixErodeBrickDwa. These were used when applicable. Both 
libraries were compiled with their default (optimized) settings. 

In the experiments, we want to address several questions: 

• What is the scaling behavior of the run length methods? 

• Which of the possible different run length implementations is better? 

• Is any one method uniformly better than the others, or do we need to 
perform algorithm selection? 

• How do these algorithms perform for the types of mask sizes and images 
found in typical document analysis tasks? 
Figure 2: Average running times (vertical axis) for opening (left) and closing
(right) with square masks of different size (horizontal axis) of the 1600 document
image pages from the UW3 document image database; the database consists of
journal article pages scanned at 300 dpi and binarized. lept = Leptonica bit blit
based implementation (using pixErodeBrickDwa for sizes < 52), bits = companion
bit blit library to the run length library, rle = run length-based morphology
using first within-line then between-line operations, rlet = run length based
morphology using within line operations only and transpose.



15.1 2D Rectangular Masks 

To gain some general insights into the behavior of the run length methods for 
real-world document images, the running times of morphological operations on 
the 1600 images of the UW3 [TT] database, 300 dpi binary images of scans of 
degraded journal publication pages, were measured. The results are shown in 
Figure 2. We see that, except for masks of size five or below, the run length
implementation outperforms the bit blit implementation. 

By choosing at runtime between the bit blit implementation and the run
length implementation, we can obtain a method that combines the
characteristics of both kinds of implementations. As already noted above, the
cross-over point can be determined automatically either based on mask size and
dpi, or based on output complexity. This is shown as the bold curve in the
figures; the curve does not coincide with the bit blit based running times
because the run length figures include the conversion times from run length
representations to packed bit representations and back to run length
representations; in many applications, these conversion costs can be
eliminated. By switching back to bit blit based implementations for small mask
sizes, we can combine the two methods into a method that gives performance
closer to bit blit implementations at small sizes while still retaining the
advantages of run length methods at large sizes.

Figure 3: A 7000 x 7000 image of a cadastral map used for performance
measurements.
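
One simple way to realize the runtime choice between the two implementations is sketched below. The text does not specify how the cross-over point is computed, so everything here is an assumption for illustration: the hypothetical `find_crossover` derives the threshold from timing tables measured by the caller (mapping mask size to seconds), and the hypothetical `choose_backend` defaults to a cross-over of five only because that is the mask size reported above for the 300 dpi UW3 pages. For other resolutions or document types the threshold would have to be measured again.

```python
def find_crossover(bits_times, rle_times):
    """Largest measured mask size at which bit blit is still at least as fast.

    Both arguments map mask size -> measured running time in seconds.
    Returns 0 if the run length method wins at every measured size.
    """
    crossover = 0
    for size in sorted(set(bits_times) & set(rle_times)):
        if bits_times[size] <= rle_times[size]:
            crossover = size
    return crossover


def choose_backend(mask_size, crossover=5):
    """Pick "bits" (bit blit) for small masks and "rle" (run length) otherwise."""
    return "bits" if mask_size <= crossover else "rle"
```

A caller would run `find_crossover` once on benchmark data for its own images and then pass the result to `choose_backend` for each operation.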

In a second experiment, we compared the performance of the run length method
to Leptonica's bit blit based morphology on a different document type, a
binarized 7000 x 7000 pixel cadastral map (Figure 4).

15.2 Circular Masks 

As an illustration of the kind of performance achievable using the run length
methods for general purpose structuring elements, Figure 5 shows the
performance on circular and rectangular structuring elements using run-length
and brute force bit blit morphology.

As expected, the time for the brute force bit blit morphology grows quadrat- 
ically in the size of the structuring element. Run length morphology grows lin- 
early up to a point where the output complexity (the number of runs in the 
output image) starts decreasing and dominates the running time. 
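
The linear growth can be made plausible with a minimal sketch of run length erosion by an arbitrary mask. It is not the library's implementation; the names `erode_line`, `intersect`, `erode`, and `disc_mask` are hypothetical. The sketch assumes each image row is a sorted list of half-open foreground runs `(start, end)`, the mask is a list of `(dy, a, b)` entries (one horizontal run from offset a to offset b, inclusive, per mask row), and pixels outside the image are background.

```python
import math


def erode_line(runs, a, b):
    """Erode one row of runs by the horizontal offset interval [a, b]."""
    return [(s - a, e - b) for s, e in runs if s - a < e - b]


def intersect(p, q):
    """Intersect two sorted run lists with a two-pointer sweep."""
    out, i, j = [], 0, 0
    while i < len(p) and j < len(q):
        s, e = max(p[i][0], q[j][0]), min(p[i][1], q[j][1])
        if s < e:
            out.append((s, e))
        if p[i][1] < q[j][1]:
            i += 1
        else:
            j += 1
    return out


def erode(image, mask):
    """Erode a run length image by a mask given as (dy, a, b) rows."""
    height, out = len(image), []
    for y in range(height):
        acc = None
        for dy, a, b in mask:
            row = image[y + dy] if 0 <= y + dy < height else []
            part = erode_line(row, a, b)
            acc = part if acc is None else intersect(acc, part)
            if not acc:
                break  # nothing left to intersect for this output row
        out.append(acc or [])
    return out


def disc_mask(r):
    """Digital disc of radius r as one horizontal run per mask row."""
    return [(dy, -math.isqrt(r * r - dy * dy), math.isqrt(r * r - dy * dy))
            for dy in range(-r, r + 1)]
```

Each output row costs one pass over the runs per mask row, so the work grows with the mask height times the number of runs, not with the mask area.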

15.3 Document Analysis Performance 

In the third experiment, we want to illustrate the overall performance of run
length morphology methods as part of a simple morphological layout analysis
system. The method estimates the inter-word and inter-line spacing of document
images based on black and white run lengths, then performs erosion operations
to smear together connected components that are likely to be part of the same
blocks based on those estimates, and finally computes the bounding boxes of
the resulting large connected components; this approach is similar to the one
in [23]. As the input, the 1600 pages from the UW3 database were used. These
are 300 dpi letter sized page images scanned from published journals. The
relative performance of the run-length based method and Leptonica's bit blit
based method, including bounding box extraction, is shown in Figure 6. The
results show that the run length morphological algorithms run about twice as
fast at 300 dpi as the bit blit based algorithms in Leptonica (at 600 dpi or
1200 dpi, the advantage of run length methods would be greater still).

Figure 4: Times for opening (left) and closing (right) the 7000 x 7000 image
of a cadastral map in Figure 3 with square masks of different sizes. lept =
Leptonica bit blit based implementation (using pixErodeBrickDwa for sizes <
52), bits = companion bit blit library to the run length library, rle = run
length based morphology using first within-line then between-line operations,
rlet = run length based morphology using within-line operations only and
transpose.
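
The final step of such a system, bounding box extraction for connected components, can be carried out directly on runs. The following is a minimal sketch and not the library's code: the name `component_boxes` is hypothetical, rows are sorted lists of half-open foreground runs `(start, end)`, 4-connectivity is assumed (runs in adjacent rows are connected only if they share a column), and boxes are returned as half-open `(x0, y0, x1, y1)` tuples.

```python
def component_boxes(image):
    """Bounding boxes of 4-connected components of a run length image."""
    parent, boxes = [], []  # union-find parents and per-run boxes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    prev = []  # (start, end, run_id) for the previous row
    for y, runs in enumerate(image):
        cur, j = [], 0
        for s, e in runs:
            rid = len(parent)
            parent.append(rid)
            boxes.append((s, y, e, y + 1))
            # Skip previous-row runs that end before this run starts.
            while j < len(prev) and prev[j][1] <= s:
                j += 1
            # Merge with every previous-row run that overlaps [s, e).
            k = j
            while k < len(prev) and prev[k][0] < e:
                a, b = find(prev[k][2]), find(rid)
                if a != b:
                    parent[b] = a
                k += 1
            cur.append((s, e, rid))
        prev = cur

    merged = {}
    for rid, (x0, y0, x1, y1) in enumerate(boxes):
        root = find(rid)
        if root in merged:
            m = merged[root]
            merged[root] = (min(m[0], x0), min(m[1], y0),
                            max(m[2], x1), max(m[3], y1))
        else:
            merged[root] = (x0, y0, x1, y1)
    return sorted(merged.values())
```

As with the morphological operations, the work is proportional to the number of runs, so the smeared, low-complexity images produced by the preceding step are cheap to label.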

16 Discussion 

The paper describes a number of methods used in our open source binary
morphology library for performing morphological and related operations on run
length representations. Although there is considerable overlap with previously
published results, some algorithmic details appear not to be well known, like
the use of skew operations for linear structuring elements, and the application
of run-length like decompositions and doubling for bit blit based morphology
with arbitrary structuring elements. We hope that these methods and
implementations will become a useful reference implementation for other work,
as well as a practical library for performing binary morphology on document
images.

Figure 5: Times for morphological closing of a binary image of a two column
text page at 600 dpi using structuring elements of different sizes. Both
circular structuring elements and square structuring elements are shown.

The performance measurements on real-world document images over a wide
range of mask sizes, as well as the performance evaluation in the context of a
complete layout analysis system, demonstrate clearly that run length
morphology is an efficient alternative to bit blit based morphology for
realistic document images. Furthermore, a comparison of the results on 300 dpi
page images with those on larger cadastral maps, as well as theoretical
considerations, suggests that the advantage of run length methods increases as
the size and resolution of images increase.

The algorithms described in this paper were developed originally for
document image applications and have proven useful in a variety of practical
applications in the years since. Although it is difficult to establish
formally, run length based algorithms generally seem to be somewhat easier to
implement efficiently than bit blits, since the boundary conditions and special
cases seem to be simpler and fewer. Furthermore, run length morphology has no
machine word size dependencies.

It will remain for future work to see how the algorithms presented in this
paper relate to methods recently proposed in the literature. Van Droogenbroeck
[7], for example, also describes two algorithms using list or rectangular structures
to aid in the computation of fast binary morphology. It appears that his methods
are considerably more complex to implement. Benchmarking and comparison
of these new methods (including the ones presented in this paper) will have
to take into account both space and running times, since both are crucial in
practical applications. Unlike high performance bit blit methods, these new
methods are also sensitive to the complexity of both the input and output. Van
Droogenbroeck also raises the issue of implementation complexity; while this is
hard to quantify, it appears that the run length methods described in this paper
are easier to implement. All these approaches are part of a trend toward taking
advantage of coherence in images, similar to the way compression algorithms do.

Figure 6: The left panel shows boxplots of the running times of the
morphological layout analysis system using either the run length method
(mixed within/between methods) or Leptonica's bit blit methods (using
pixErodeBrickDwa where possible). The right panel shows a smoothed 2D
histogram of the same data. The system estimates character and line spacing
from run lengths and then performs a rectangular dilation that merges lines
and characters into blocks. Finally, it computes bounding boxes of connected
components. Performance is shown over the 1600 IMGLINES images from the
UW3 database.

There are a number of obvious directions for extensions. Arbitrary
morphological masks can be represented as run length images as well,
and morphology over them can be carried out using approaches similar to those
described here. All the accesses within the inner loops of these algorithms are
sequential; this presents opportunities for more compact representations, such
as variable length multi-byte encodings of run lengths (since shorter run lengths
tend to be more frequent than longer run lengths). In fact, even Huffman coding
or in-memory zlib compression for each run are possible.
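
As an illustration of a variable length multi-byte encoding, the sketch below stores each run length in 7-bit groups with a continuation bit, in the style of LEB128. This format is an assumption chosen for the example, not a format used by the library, and the names `encode_runs` and `decode_runs` are hypothetical. Run lengths below 128 take one byte, and decoding is a single sequential pass.

```python
def encode_runs(lengths):
    """Encode non-negative run lengths as 7-bit groups, low group first."""
    out = bytearray()
    for n in lengths:
        while n >= 0x80:
            out.append((n & 0x7F) | 0x80)  # high bit set: more bytes follow
            n >>= 7
        out.append(n)                      # final byte has the high bit clear
    return bytes(out)


def decode_runs(data):
    """Decode a byte string produced by encode_runs."""
    lengths, n, shift = [], 0, 0
    for byte in data:
        n |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            lengths.append(n)
            n, shift = 0, 0
    return lengths
```

Because the bytes are consumed strictly in order, such an encoding fits the sequential access pattern of the inner loops.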

Let us conclude by examining how these results can be used in practice.
Run length morphological operations can be incorporated into systems in
various ways: (1) a system can use RLE representations and temporarily convert
to bitmaps when there is a performance advantage (taking into account the 
conversion costs), (2) a system can use bitmaps as its primary representation 
and temporarily switch to RLE when it speeds things up, (3) a system can keep 
everything in RLE format, or (4) a system can continue to keep everything in 
bitmap format. It is clear that, provided the system selects the correct algorithm
automatically, (2) is no worse than (4) and (1) is no worse than (3) in terms of
performance, and the paper has, in fact, given examples of speedups using such
mixed approaches. The experiments presented above on a simple layout analysis
system also suggest (but do not conclusively prove) that (3) may be faster than
(4) on average in real-world applications. The question of whether (1) or (2)
is faster for real-world applications remains to be determined. Many real-world 
imaging applications, such as printing engines, already use run length represen- 
tations internally, and the methods presented in this paper give them the ability 
to integrate and perform morphological operations directly and efficiently. Ex- 
isting bitmap-based libraries like Leptonica may want to choose approach (2) 
to improve performance on large masks without affecting software using the 
library. Furthermore, the run length conversion and operations can be incor- 
porated directly into a blit-like operation, resulting in a hybrid approach; this 
will be explored elsewhere. Overall, run length approaches to binary morphol- 
ogy give us another useful option for implementing morphological operations 
efficiently, in particular in document imaging applications.